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Hamiltonian simulation with nearly 
optimal dependence on all parameters 


Dominic W. Berry * * Andrew M. Childs h Robin Kothari ^ 


Abstract 

We present an algorithm for sparse Hamiltonian simulation whose complexity is optimal (up 
to log factors) as a function of all parameters of interest. Previous algorithms had optimal or 
near-optimal scaling in some parameters at the cost of poor scaling in others. Hamiltonian 
simulation via a quantum walk has optimal dependence on the sparsity at the expense of poor 
scaling in the allowed error. In contrast, an approach based on fractional-query simulation 
provides optimal scaling in the error at the expense of poor scaling in the sparsity. Here we 
combine the two approaches, achieving the best features of both. By implementing a linear 
combination of quantum walk steps with coefficients given by Bessel functions, our algorithm’s 
complexity (as measured by the number of queries and 2-qubit gates) is logarithmic in the 
inverse error, and nearly linear in the product r of the evolution time, the sparsity, and the 
magnitude of the largest entry of the Hamiltonian. Our dependence on the error is optimal, and 
we prove a new lower bound showing that no algorithm can have sublinear dependence on r. 


1 Introduction 

The problem of simulating the dynamics of quantum systems was the original motivation for quan¬ 
tum computers [19] and remains one of their major potential applications. Although classical 
algorithms for this problem are inefficient, a significant fraction of the world’s computing power 
today is spent in solving instances of this problem that arise in, e.g., quantum chemistry and ma¬ 
terials science [25, 26]. Furthermore, efficient classical algorithms for this problem are unlikely 
to exist: since the simulation problem is BQP-complete [20], an efficient classical algorithm for 
quantum simulation would imply an efficient classical algorithm for any problem with an efficient 
quantum algorithm (e.g., integer factorization [29]). 

The first explicit quantum simulation algorithm, due to Lloyd [24] , gave a method for simulating 
Hamiltonians that are sums of local interaction terms. Aharonov and Ta-Shma gave an efficient 
simulation algorithm for the more general class of sparse Hamiltonians [2], and much subsequent 
work has given improved simulations [5, 6, 7, 8, 11, 12, 16, 28, 30]. Sparse Hamiltonians include 
most physically realistic Hamiltonians as a special case (making these algorithms potentially useful 
for simulating real-world systems). In addition, sparse Hamiltonian simulation can be used to 
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design other quantum algorithms [13, 14, 21]. For example, it was used to convert the algorithm 
for evaluating a balanced binary NAND tree with n leaves [17] to the discrete-query model [14]. 

In the Hamiltonian simulation problem, we are given an n-qubit Hamiltonian H (a Hermitian 
matrix of size 2 n x 2 n ), an evolution time t, and a precision e > 0, and are asked to implement 
the unitary operation e~ lHt up to error at most e (as quantified by the diamond norm distance). 
That is, the task is to implement a unitary operation, rather than simply to generate [3] or convert 
[23] a quantum state. We say that H is d-sparse if it has at most d nonzero entries in any row. 
In the sparse Hamiltonian simulation problem, H is specified by a black box that takes input 
(j, 7) € [ 2 n ] x [d] (where [d] := {1,..., d}) and outputs the location and value of the 7th nonzero 
entry in the jth row of H. Specifically, as in [6], we assume access to an oracle Oh acting as 

0 H \j,k,z) = \j,k,z® H jk ) (1) 

for j, k € [2 n ] and bit strings z representing entries of H, and another oracle Of acting as 

0 F \j,£) = \jJ(j,£)), (2) 


where f(j, 7): [2 n ] x [d] —>• [2 n ] is a function giving the column index of the 7th nonzero element 
in row j. Note that the form of Op assumes that the locations of the nonzero entries of H can 
be computed in place. This is possible if we can efficiently compute both (j, 7) >->■ /(j, 7) and the 
reverse map (j, f(j, 7)) i —> 7, which holds in typical applications of sparse Hamiltonian simulation. 
Alternatively, if / provides the nonzero elements in order, we can compute the reverse map with 
only a log d overhead by binary search. 

At present, the best algorithms for sparse Hamiltonian simulation, in terms of query complexity 
(i.e., the number of queries made to the oracles) and number of 2-qubit gates used, are one based 
on a Szegedy quantum walk [6, 12] and another based on simulating an unconventional model of 
query complexity called the fractional-query model [7]. An algorithm with similar complexity to [7] 
is based on implementing a Taylor series of the exponential [8]. The quantum walk approach has 
query complexity 0(d\\H\\ max t / y/e), which is linear in both the sparsity d and the evolution time t. 
(Here ||id|| max denotes the largest entry of H in absolute value.) However, this approach has poor 
dependence on the allowed error e. In contrast, the fractional-query approach has query complexity 
Ol*r&gk) , where r := d||id|| max t. This approach gives exponentially better dependence on the 
error at the expense of quadratically worse dependence on the sparsity. Considering the fundamental 
importance of quantum simulation, it is desirable to have a method that achieves the best features 
of both approaches. In this work, we combine the two approaches, giving the following. 


Theorem 1. A d-sparse Hamiltonian H acting on n qubits can be simulated for time t within error 
e with 


O 


( log (r/e) \ 
v log log(r/e)/ 


( 3 ) 


queries and 


O 



+ log 5 / 2 (r/e)] 


log(r/6) \ 
log log (r/e)/ 


additional 2-qubit gates, where r := d||id|| max f. 


( 4 ) 


This result provides a strict improvement over the query complexity of [7, 8] , removing a factor 
of d in t, and thus providing near-linear instead of super quadratic dependence on d. 

We also prove a lower bound showing that any algorithm must use 14(r) queries. While a lower 
bound of 71(f) was known previously [5], our new lower bound shows that the complexity must 
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be at least linear in the product of the sparsity and the evolution time. Our proof is similar to a 
previous limitation on the ability of quantum computers to simulate non-sparse Hamiltonians [15]: 
by replacing each edge in the graph of the Hamiltonian by a complete bipartite graph Kd,d, we 
effectively boost the strength of the Hamiltonian by a factor of d at the cost of making the matrix 
less sparse by a factor of d. Combining this result with the error-dependent lower bound of [7], we 
find a lower bound as follows. 


Theorem 2. For any e,t > 0, integer d > 2, and fixed value of ||i7|| max , there exists a d-sparse 
Hamiltonian H such that simulating H for time t with precision e has query complexity 


n 



l°g(l/e) \ 
log log(l/e) / 


(5) 


Thus our result is near-optimal for the scaling in either r or e on its own. However, our upper 
bound (3) has a product, whereas the lower bound (5) has a sum. It remains an open question how 
to close the gap between these bounds. Intriguingly, a slight modification of our technique gives 
another algorithm with the following complexity. 


Theorem 3. For any a € (0,1], a d-sparse Hamiltonian H acting on n qubits can be simulated for 
time t within error e with query complexity 

O (r 1+ "/ 2 + t 1 ""/ 2 log(l/e)). (6) 


This result provides a nontrivial tradeoff between the parameters t, d, and e, and suggests that 
further improvements to such tradeoffs may be possible. 

We now informally describe the key idea behind our algorithms. For simplicity, suppose that the 
entries of the Hamiltonian are small, satisfying ||i/|| max < 1/d, and t = 1. Previous work on Hamil¬ 
tonian simulation [6, 12] has shown that using a constant number of queries, we can construct a 
unitary U whose top-left block (in some basis) is exactly e -* arcsm (^). Technical difficulties aside, the 
essential problem is to implement the unitary e~ lH given the ability to perform e~ l arcsm (^). While it 
is not clear how to express e~ lH as a product of easy-to-implement unitaries and e -* arcsm (#) ; it can 
be approximated by a linear combination of powers of e -* arcsm ( J H. Although such a decomposition 
may not seem natural, we show that nevertheless it leads to an efficient implementation. 

In the next section we present a more technical overview of this high-level idea. In Section 3 we 
analyze and prove the correctness of our algorithms. Section 4 proves the lower bound presented 
in Theorem 2 and we conclude with some discussion in Section 5. 


2 Overview of algorithms 

Our algorithm uses a Szegedy quantum walk as in [6, 12], but with a linear combination of dif¬ 
ferent numbers of steps. Such an operation can be implemented using the techniques that were 
developed to simulate the fractional-query model [7]. This allows us to introduce a desired phase 
more accurately than with the phase estimation approach of [6, 12], As in [7], we first implement 
the approximated evolution for some time interval with some amplitude and then use oblivious am¬ 
plitude amplification to make the implementation deterministic, facilitating simulations for longer 
times. In the rest of this section, we describe the approach in more detail. 

References [6, 12] define a quantum walk step U that depends on the Hamiltonian H to be 
simulated. In turn, this quantum walk step is based on a state preparation procedure that only 
requires one call to the sparse Hamiltonian oracle, avoiding the need to decompose H into a sum of 
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terms as in product-formula approaches. Two copies of the Hilbert space acted on by H are used. 
First, the initial state is in one of these Hilbert spaces. Then, the state preparation procedure is 
used to map the initial state onto the joint Hilbert space. This state preparation acts on the second 
copy of the Hilbert space, controlled by the state in the first Hilbert space. The quantum walk 
steps take place in this joint Hilbert space. Finally, the controlled state preparation is inverted to 
map the final state back to the first Hilbert space. 

In the controlled state preparation, each eigenstate of H is mapped onto a superposition of two 
eigenstates \n±) of the quantum walk step U . The precise definition of U is not needed here; for 
our application, it suffices to observe that the eigenvalues /jl± of U are related to the eigenvalues A 
of H via 

/i± = ±e ±^ rc sin(A/Xd), (7) 

where X > ||lL|| max is a parameter that can be increased to make the steps of the quantum walk 
closer to the identity. For small X/Xd, the steps of the quantum walk yield a phase factor that 
is nearly proportional to that for the Hamiltonian evolution. However, the phase deviates from 
the desired value since the function arcsinz/ is not precisely linear about v = 0. Also, there are 
two eigenvalues n±, and in previous approaches it was necessary to distinguish between these to 
approximate Hamiltonian evolution [6, 12]. In contrast, for the new technique we present here it is 
not necessary to distinguish the eigenspaces. 

An obvious way to increase the accuracy is to increase X above its minimum value of ||fL|| max . 
However, the number of steps of the quantum walk is 0(tXd), so increasing X results in a less 
efficient simulation. Another approach is to use phase estimation to correct the phase factor [6, 12], 
but this approach still gives polynomial dependence on 1/e. 

Instead, we propose using a superposition of steps of the quantum walk to effectively linearize 
the arcsin function. Specifically, rather than applying U, we apply 

k 

Vk-= E a m U m (8) 

m=—k 


for some coefficients a_fc,..., a^. We show that the coefficients can be chosen by considering the 
generating function for the Bessel function [1, 9.1.41], 


E 

m =—oo 



_ JXz/Xd 


(9) 


where the second equality follows from (7). Because the right-hand-side does not depend on whether 
the eigenvalue of U is fx + or /j_ , there is no need to distinguish the eigenspaces. Thus the ability 
to perform the operation 

OO 

E Jm(z)U m (10) 

m =—oo 

would allow us to exactly implement the evolution under H for time — z/Xd . Because of the minus 
sign, we will take z to be negative to obtain positive time. By truncating the sum in (10) to some 
finite range {— k ,..., k}, we obtain an expression in which each term can be performed using at 
most k queries. Because the Bessel function falls off exponentially for large |m|, we can obtain error 
at most e with a cutoff k that is only logarithmic in 1/e. 

A linear combination of unitaries (LCU) such as (8) can be implemented using the LCU Lemma 
(Lemma 4) described in the next section. The high-level intuition for the procedure is as follows. 
We prepare ancilla qubits in a superposition encoding the coefficients of the linear combination 
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and then perform the unitary operations of the linear combination in superposition, controlled by 
the ancilla. One could then obtain 14 by postselecting on an appropriate ancilla state. Instead, to 
obtain 14 deterministically, we apply the oblivious amplitude amplification procedure introduced 
in [7]. Rather than using 14 to implement evolution over the entire time, we break the time up 
into shorter time steps we call “segments” (named by analogy to the segments used in [7]) and use 
14 to achieve the time evolution for each segment. 

The complexity of our algorithm is the number of segments (tXd/\z\) times the complexity for 
each segment (k) times the number of steps needed for oblivious amplitude amplification (a). We 
have some freedom in choosing z, which controls the amount of evolution time simulated by each 
segment. To obtain near-linear dependence on the evolution time t, we choose 2 = 0(1). Then 
amplitude amplification requires 0(1) steps, and the number of segments needed is O(r), giving 
the linear factor in (3). The value of k needed to achieve overall error at most e is logarithmic in 
r/e, yielding the logarithmic factor in (3). 

An alternative approach is to use a larger segment that scales with r. Choosing z = — r“ for 
a € (0,1], we need k = 0(r“ + log(l/e)). Then we require 0(T l ~ a ) segments and 0(t Q//2 ) steps of 
amplitude amplification, giving the scaling presented in Theorem 3. 

3 Analysis of algorithms 

3.1 A quantum walk for any Hamiltonian 

We begin by reviewing the quantum walk defined in [7, 12]. Given a Hamiltonian H acting on 
C N (where N := 2 n ), the Hilbert space is expanded to C 2N <S> C 2N . First, an ancilla qubit in the 
state |0) is appended, which expands the space from C N to C 2N . Then the entire Hilbert space is 
duplicated, giving C 2N <8) C 2N . This is achieved using the isometry 

N—l 

T: = E E (liXil ® l&X&l) ® \vit) (ii) 

3=0 6e{0,l} 

with = 10)11) and 

'4 (i2) 

where X > ||F7|| max and Fj is the set of indices of nonzero elements in column j of H. Here we use 
the convention that the first subsystem is the original space, the next is the ancilla qubit, and the 
third and fourth subsystems are the duplicated space and duplicated ancilla qubit, respectively. 
This operation can be viewed as a controlled state preparation, creating state on input |j)|0). 
If the ancilla qubit is in the state 11), then |0)|1) is prepared. Starting with the initial space, the 
controlled state preparation is performed, and then steps of the quantum walk are applied using 
the unitary 

U := iS(2TT^ - I), (13) 

where S swaps the two registers (i.e., S\ji)\j 2 )\h)\h) = Ki>|^ 2 >|Ji>|J 2 > for all j 1 ,£ 1 € [IV], j 2 ,i 2 € 
{0,1}). Finally, the inverse state preparation T' is performed. For a successful simulation, the 
output should lie in the original space, and the ancilla should be returned to the state |0). 

Let A be the eigenvalue of H with eigenstate |A), and let v := A /Xd be the corresponding scaled 
eigenvalue for the quantum walk. The steps of the quantum walk U satisfy U\fi±) = /i±|h±) [12] 
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with 


\h±) ■= (T + i(i±ST) |A) 


(14) 

(15) 


H± := ±^l-u 2 + iu = ±e ±iaxcs in^ 

To apply the steps of the quantum walk to approximate Hamiltonian evolution, there are two 
challenges: we must handle both the |// + ) and |/z_) sectors, and correct the applied phase. In this 
work we are able to solve both these challenges at once by using a superposition of steps of the 
quantum walk. 

3.2 Linear combination of unitaries 

We now describe how to perform a linear combination of unitary operations. Given an M-tuple 
of unitary operations U = (U\,... ,Um), we quantify the complexity of implementing a linear 
combination of the U m s in terms of the number of invocations of 

M 

select(17) := \m)(m\ ® U m . (16) 

m— 1 

Such a result was previously given in [8, 22], Here we formalize that result and generalize to allow 
more steps of oblivious amplitude amplification. The overall result is as given in the following 
lemma. 

Lemma 4 (LCU Lemma). Let U = (U\,, Um) be unitary operations and let V = X)m=i a mU m 
be 5-close to a unitary. We can approximate V to within 0(5) using 0(a) select(C/) and select (17') 
operations and 0(Ma) additional 2-qubit gates, where a : = Yl,m=i l°m|- 

To prove this result, we first consider an operation that would give V with postselection, then 
apply oblivious amplitude amplification to achieve it deterministically. The operation that provides 
V with postselection is described in the following lemma. 

Lemma 5. Let U = (U\,... ,Um) be unitary operations acting on a Hilbert space H 2 , let V = 
Yjm=l a mUm, and let S > J2m=1 \ a m\ be a real number. Define a Hilbert space Hi to be a tensor 
product of a qubit and a subspace of dimension M, and let |<j) := 10)10) € H\. Then there exists a 
unitary operation W acting onH\®H 2 such that Z = ■jr(|?)(?|<8>H), with Z := PWP, P := |?)(sJ<8>II. 
The operation W can be applied with 0(1) select(C/) and select(?7^) operations and O(M) additional 
2-qubit gates. 

Proof. To perform W, we perform an operation that rotates the ancilla qubits from |<j) to the state 

+ (17) 

where a := Ylm=i \ a m\- This state is of dimension 2 M and can be prepared from state |<j) using 
O(M) operations (which is trivial for | m) encoded in unary). Next we perform the controlled oper¬ 
ation select(C/). Finally, inverting the preparation of |x) and projecting onto |?) would effectively 
project the ancilla onto |x). Then the unnormalized operation on H 2 is V/s, corresponding to Z. 
The action of applying the unitary operation to prepare |%), the controlled operation select(U), 
and the inverse preparation gives the desired operation W. □ 
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Next we provide a multi-step version of robust amplitude amplification, generalizing the single- 
step version presented in [8]. In this lemma, and throughout this paper, ||-|| denotes the spectral 
norm. 


Lemma 6 (Robust oblivious amplitude amplification). Let W be a unitary matrix acting on R\®R 2 
and let P be the projector onto the subspace whose first register is |?) := 10)10) G Hi, i.e., P := 
|?)(?| <8>I- Furthermore let Z := PWP satisfy Z = ^(|?)(?| <8> V), where V is 6-close to a unitary 
matrix and sin( 2 ( 2 7+i) ) = i for some i € N, and let R := — W(I — 2P)W^{1 — 2P). Then 

\\PR e WP-(\g) (?|®V')|| = 0(6). (18) 

Proof. We start by considering a single iteration, as in [8]. Then we have 

RWP = -IT (I - 2 P)W *(I - 2P)WP 

= - WP + 2WP + 2 PWP - 4WPW*PWP 

= WP + 2Z - AWZ ] Z. (19) 

Multiplying by P on the left gives 

PRWP = -PIT (I - 2 P)W* (I - 2 P)WP = 3Z — AZZ ] Z, (20) 


which matches the expression in [8]. 

The general solution after m iterations is 


R m WP = (WP - Z) 


r 2m+1 (Vi - z^z) , (-i) m T 2m+ i(Vzfz) 


Vi - zfz 


+ z 


VzTz 


( 21 ) 


where T 2m +i are Chebyshev polynomials of the first kind. (Because Chebyshev polynomials for 
odd order only include odd powers, no square roots appear when (21) is expanded.) 

We establish (21) by induction. First note that it holds for m = 0, because T\(x) = x, so the 
right-hand side evaluates to WP. Next assume that it holds for a given m. It is straightforward to 
show that 


R(WP - Z) = (WP - Z)( 1 - 2Z^Z) + 2Z(l - Z^Z) and 
RZ = (WP - Z)(-2Z*Z) + Z( 1 - 2 Z ] Z). 


(WP - z) T 2 m + i(Vl-Z^Z) + z (-irT 2m+ i(Vzfz) 


Vi - z*z 


Hence, multiplying both sides of (21) by P, we get 

R m+1 WP = R 

= (WP - Z) 

+ Z 




( i - 2Z , _ 2z , z (-i rr^rm 

Vi - z\z 4zXz 

2(1 - z t z) + (1 _ 2Z t z) (-l)"‘W^ZtZ) ~ 

y/l — Z^Z \JZ^Z 


To progress further, we use the relation [1, 22.3.15] 

T 2m -\-i(x) = cos[(2m + 1) arccosx] = (—l) m sin[(2m + 1) arcsinx]. 


( 22 ) 


(23) 


(24) 
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Using this we find that, with x = sin0 


(1 - 2x 2 ) 


T 2m+1 (Vl^) _ 2x2 (-l) m T 2m+1 (x) 


v 7 1 — 


_ 2 a 2 a ,cos[(2m + 1)0} 


= (cos 7 0 — sin 7 0)- 


cos 9 


— 2 sin 6 sin[(2m + 1)0] 


cos(20) cos[(2m + 1)0] — sin(20) sin[(2m + 1)0] 


cos 0 


cos[(2m + 3)0] 
cos 0 

T 2m +3 (\/l — x 2 ) 


Vi —. 


(25) 


Next put x = cos <f> to obtain 
2(1 - x 2 ) 


2x7Wi(Vl - x 2 ) 2 (-l) m T 2m+ i(x) 


v / U r 


x^ 


+ (1 — 2x 7 )- 


= 2 (sin 7 </>) 

= (-l) 


2 . (—l) m sin[(2m + 1)0] 2 , ■ 2 , ( — l) m c°s[(2m + 1)0] 


— (cos ~ sin <£>)- 

sin (j) cos ( 

sin(2 (f>) sin[(2m + 1)<^>] — cos(20) cos[(2m + 1 )<f>\ 


cos 4> 


m+i cos[(2m + 3)0] 


cos 4> 


= (-l) 
= (-l) 


Using these relations, (23) simplifies to 


m-\-l -^2m+3 


(x) 


R'^WP = (1 WP - + z (-ir + 1 T 2 m+ 3 (XzTz) 


Vi - ztz 


Vztz 


(26) 


(27) 


Hence we find that, if (21) is correct for non-negative integer m, it is correct for m + 1. Hence it 
is correct for all non-negative integers m by induction. 

Thus we find that by multiplying on the left by P, we obtain 


PR m WP = Z 


(-i) m r 2m+1 (\/ztz) 


\fzXz 

Then, in the case that 5 = 0, i.e., if V is equal to a unitary V, we would have 

s = J(k>fci®n 

Then Z^Z = (|?)(?| <8>II)/s 2 , and we get 

,(-l) m T 2m+1 (\/ZtZ) 1.. .. ( l) m P 2m _|_ 1 (1/s) 


(28) 


(29) 


Z- 


yftfZ 


= -sm\®v)- 


s . 1/s 


= (k)<?l®^)(-irr 2 ro+ i(i/ s ) 

= (|?)(?| <S> V ) sin[(2m + 1) arcsin(l/s)]. 


( 30 ) 


































In the case m = i, we can use sin( 2 ( 2 i+i) ) = 7 to obtain sin[(2£+ 1) arcsin(l/s)] = 1, which implies 


PR £ WP = |?)(?| <g> V. 


(31) 


Next, consider the case where V is only <5-close to being unitary. Let us define 

A := \JvW — I. 

We immediately obtain ||A|| = 0(5), and 

-(|?)(?|® A) = VtfZ -~(\g)(g\®I). 
s s 

We then get 

^(-lYT^iVzfZ) l n w ,^(-l)^ +1 ((I + A)/ S ) 

z 7 Wz “ * (k)(?l 0 } (i + a)7s 


= (kM®^) 


(-1/T 2W ((I + A)/ S ) 
1 +A 


Using (—l) e T 2 £+i(x) = sin[(2£ + l)arcsinx] and (—l)^T 2 £ + i(l/,s) = 1, we obtain 

||(-1) £ T 2£+1 ((I + A)/ s) - III = 0(£ 2 5 2 /s 2 ). 

Since t = 0(s), £ 2 /s 2 = 0(1), which implies 

||(-l) £ T 2m ((I + A)/s) - I|| = 0(5 2 ). 

The contribution to the error from I + A is 0(5), so we have 

(~iyT n+1 (Vtfz) 


z- 


Vzfz 

Using (28) we then get (18) as required. 


-(k)(?l®n 


= 0(5). 


(32) 

(33) 

(34) 

(35) 

(36) 

(37) 

□ 


Lemma 6 is in terms of the spectral norm distance, but the diamond norm distance is at most 
a constant factor larger. The specific result, proven in the Appendix, is as follows. 


Lemma 7. Let U, V be operators satisfying ||17|| < 1 and ||U|| < 1, and let || • || 0 denote the 
diamond norm. Then \\U — I || 0 < 2|| U — U||. Furthermore, ifV is a quantum channel with Kraus 
decomposition V(p) = V pV^ + VjPVj an d M(p) = UpU^, then || U — V||o < 4|| U — U||. 

The LCU Lemma follows by combining Lemma 5 and Lemma 6 . 


Proof of Lemma f. Using Lemma 5 we can implement the operation W required for Lemma 6 using 
0(1) select(17) and select(LU) operations and O(M) additional 2-qubit gates. We can choose s > a 
such that si n ( 2 ( 2 7 + i) ) = 7 f° r some i € N. Then Lemma 6 shows that V can be approximated to 
within 0(5) using 0(1) applications of W and the projection P. Since i = 0(s) = 0(a ), the total 
number of select(17) and select(17') operations is 0(a) and the number of additional 2-qubit gates 
is O(Ma). □ 
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3.3 Main algorithm 

The main problem with applying the quantum walk as presented in [ 6 , 12] is that arcsin v is a 
nonlinear function of u, so an imprecise phase is introduced. To solve this, we use a superposition 
of different numbers of applications of U. Define Vj~ as in ( 8 ), where the choice of {a m }^ n= _ k is 
considered below. The eigenvalues of V k corresponding to the eigenvalues /j,± of U are 

k 

:= 'y ^ • (38) 

m=—k 


In general j-i±.k can depend on ±. However, we will choose a m satisfying a_ m = (—l) m a m , which 
yields n±.k independent of ±. 

To see how to choose the coefficients a m , solve (15) for v to give 


i 


v = — 


2 



(39) 


This implies that, for any z, 



This corresponds to the standard generating function for the Bessel function [ 1 , 9.1.41], so 


(40) 


e iuz 


exp 


z 

2 





(41) 


Thus we can take a m ~ J m (z). Because there are efficient classical algorithms to calculate Bessel 
functions, the circuit to prepare lx*,) can be designed efficiently. Note that for large m, we have 
\Jm(z)\ ~ -^j\z/2\ m [1, 9.3.1], so the values of a m are similar to the coefficients in the expansion of 
the exponential function. Thus the segments used here are analogous to the segments used in [ 8 ]. 

To determine the complexity of this approach, we primarily need to bound the error in approx¬ 
imating e luz . To optimize the result, we use the coefficients 




Jm{z) 

z-=-kW 


(42) 


We make this choice because the most accurate results are obtained when the values of a m sum to 
1. Note also that this yields the symmetry a_ m = (—l) m a m , because J- m (z ) = (—1 ) m J m (z) [1, 
9.1.5]. The sum of J m (z) over all integers m is equal to 1 (which can be shown by putting t = 1 in 
[1, 9.1.41]), but because k is finite, we normalize the values as in (42). With this choice, we have 
the following error bound, proved in the Appendix. 


Lemma 8. With the values a m as in (^2), for \z\ < k we have 


\\Vk~Voo 


n\H\\ (z/ 2 ) k+1 \ 

V Ml kl ) ' 


(43) 


Note that is the exact unitary operation desired. We now determine the query complexity 
of this approach. In fact, we prove a result that is slightly tighter than the query complexity stated 
in Theorem 1. 
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Lemma 9. A d-sparse Hamiltonian H acting on n qubits can be simulated for time t within error e 
with complexity (quantified by the number of 2-qubit operations and controlled-U and controlled-U t 
operations ) 


( log(||ff||t/ e ) \ 
V loglog(||Uj|f/e)/ 


(44) 


Proof. Our main goal is to determine the value of k needed to bound the error by e. This depends 
on the length of time for the segments, which we can adjust by choosing the value of z. We wish 
to perform each step deterministically with one step of oblivious amplitude amplification, so we 
should have s = 2 in Lemma 6 . Using the values of a m given in (42), this means that we should 
take z = 0(1), and for concreteness z = —1/2 yields a < 2, so we can take s = 2. Then, using 
Lemma 4 with U m = U m and V = 14, we can approximate the operation 14 to within 0(5). 

Given an allowable error in a segment of 5 > 0, let us take 


( logdl^ll /Xd5) \ 
\loglog(\\H\\/Xd5)J’ 


(45) 


Then, using Lemma 8 and the inequality k\ > (k/e) k , it is straightforward to show that the error in 
each segment is no more than 6. Using Lemma 7, this bound on the error in terms of the spectral 
norm distance implies a bound on the diamond norm distance that is at most a constant factor 
larger. For the total error to be no more than e, the value of 6 can be no more than e divided by 
the number of segments. The number of segments is 0(tXd), which gives 


( iog(l|ff||*A) \ 

Vloglog(||i7||f/e)/ ' 


(46) 


Using Lemma 4, the complexity of each segment is 0{k ) since a select(U) operation can be 
implemented with complexity 0(M), and M = 2fc+l. It is straightforward to apply select(U) using 
0(k ) controlled-U and controlled-U^ operations. If | m) is encoded in unary, then each controlled 
operation may be just controlled on one qubit of | m). 

The number of segments required is 0(tXd). It is most efficient to take the minimum value of 
X, which is ||-U|| max . Because each segment uses 0(k) controlled-U and controlled-U^ operations, 
as well as O(M) = 0(k ) additional 2 -qubit gates, the complexity for the simulation over time t is 
Ofirk), Using the value of k from (46) gives the overall complexity stated in (44). □ 


Next we determine the gate complexity of this approach. Again we give a slightly tighter result 
than presented in Theorem 1. 


Lemma 10. A d-sparse Hamiltonian H acting on n qubits can be simulated for time t within error 
e using 


O r[n + F(log(\\H\\t/e))] 


log(||U||f/e) \ 
log log(\\H\\t/e)J 


(47) 


2-qubit gates, where F(m) is the complexity of performing elementary functions with m bits. 


Proof. To obtain the gate complexity, we need to consider the procedure for performing the step U 
in detail. We can perform T by first performing log d Hadamard gates to prepare the superposition 
state 

d-1 

^=£K)l°}. (48) 

Vd e=Q 
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Here we take d to be a power of 2 without loss of generality. (The value of d can always be rounded 
up to the nearest power of two.) Then the oracle Of (from (2)) can be used to produce the state 


1 

75 


£ I0lo)- 


(49) 


A call to the oracle Oh for the value of an element of the Hamiltonian (from (1)) then gives the 
value of Hjt in an ancilla space. Another ancilla qubit is rotated from |0) to 




(50) 


based on the value of Hj£. Then inverting the oracle Oh erases the value of Hjt from the ancilla 
space. Note that there is a sign ambiguity for the square root when H ;J ( takes negative real values. 
This is addressed in [6] and does not affect the complexity. 

To perform the step U, we also require the swap operation S, which has complexity 0(n ) due 
to the number of qubits. The gate complexity is 0(n ) from S, plus O(logd) = O(n) from the 
Hadamard gates, plus the complexity of performing the rotations to obtain the state (50). The 
complexity of the rotations depends on the number of bits of precision used for the entries of H. 
To obtain overall error O(e), the number of bits must be log(||i/j|f/e). To determine the rotations 
needed, we must also compute a square root and trigonometric functions on the output of the oracle. 
If these functions can be computed with complexity F(m) for m-bit inputs, the contribution to the 
overall complexity is .F(log(||if ||t/e)). 

In Lemma 9 the complexity is quantified in terms of the number of controlled-!/ and controlled- 
U t operations, so to obtain the overall gate complexity we just need to multiply that complexity 
by the cost of U. There is also a cost in terms of additional 2-qubit gates in Lemma 9, but 
that is smaller than the gate cost of performing U. Therefore, the gate complexity is equal to 
the complexity from Lemma 9 times 0(n + T’(log(||iL||t/e))), which gives a gate complexity as in 
(47). □ 

This result depends on the complexity of elementary functions, F(m ), needed to calculate 
the rotations. Using advanced techniques, F(m) may be made close to linear in m [10], though 
such advanced techniques only give an improvement for extremely high precision. Using simple 
techniques based on Taylor series and long multiplication, F(m) = 0(m 5//2 ). 

The classical complexity of determining the coefficients {a m }^ n= _ k is also potentially significant. 
A set of values of the Bessel function can be efficiently computed using Miller’s recurrence algorithm 
[9, 27]. The complexity scales as k (the number of entries) times log(||Hj|f/e) (the bits of precision 
needed for each J m (z)). This is no larger than the quantum gate complexity. 

Note that the gate complexity in Lemma 10 depends linearly on n, whereas the query complexity 
in Theorem 1 does not. This is because performing an operation on a target state with n qubits 
must require at least 0(n) gates. In contrast, the number of queries need not scale with n, because 
the queries are used to determine which gates to perform. There is an implicit complexity of H(n) 
for the queries, because the input to a query is of size at least n. 

The proof of Theorem 1 then follows immediately. 


Proof of Theorem 1. The implementation of U uses 0(1) oracle calls, which means that the query 
complexity is the same as the number of controlled applications of U. Noting that \\H\\ < d||-ffj| max , 
Lemma 9 implies the query complexity in Theorem 1, and Lemma 10 with F(m ) = 0(m 5 / 2 ) implies 
the gate complexity in Theorem 1. □ 
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3.4 A tradeoff between r and e 

The alternative algorithm characterized by Theorem 3 uses larger segments with z oc — r a for 
a E (0,1]. The case a = 0 corresponds to the case considered above, whereas a = 1 corresponds to 
a single segment. The analysis of this section assumes a > 0. 

To analyze this algorithm, we first need to bound the absolute sum of Bessel functions. 

Lemma 11. The quantity 

OO 

S ( z ) := \ J m( z )\ (51) 

m=—oo 

is 0 (a/R). 

We prove this in the Appendix. Using the robust version of amplitude amplification given in 
Lemma 6 , we obtain the bound in Theorem 3. 

Proof of Theorem 3. Using Lemma 8, Stirling’s formula, and the fact that ||14|| < d||-ff|| m ax < Xd, 
we find that for \z\ < k the error is bounded as 

\\V k - Uooll = 0{(e/k) k (z/2) k+1 ). (52) 

By Lemma 6 , the error in a segment after amplitude amplification is of the same order. Therefore, 
to ensure that the error in a segment is at most <5, it suffices to take 

k = 0(\z\ + log(l/5)) = 0{r a + log(l/5)). (53) 

With this value of k, we have X)m=—fc ^m{ z ) = 1 + 0(5), so Lemma 11 gives 

k 

Y \a m \ = 0(S(z)) = 0(y/R)- (54) 

m=—k 

This corresponds to the number of steps of oblivious amplitude amplification. The overall com¬ 
plexity is therefore 0(ky/\z\) = 0{kr a / 2 ) for a single segment. 

The number of segments is t/\z\ oc t 1 ”". This means that the complexity is 0(kT 1 ~ a ^ 2 ). 
The value of k also depends on the number of segments. We can take 6 = e/r l ~ a , which gives 
k = 0(r“ + log(l/e)), implying the result in Theorem 3. □ 

In this proof we have ignored logr in comparison to r“, which would not be valid for a = 0. For 
the gate complexity, we again have a multiplying factor of n + F(log(\\H\\t/e)), yielding a number 
of gates scaling as 

0([n + F(\og(\\H\\t/ e))] [r 1+ “ /2 + t 1_q/2 log(l/e)]). (55) 

4 Lower bound 

We now present a lower bound showing that the dependence of our algorithm on r := d||14||max^ 
is nearly optimal (and that the dependence of [ 6 ] on r is optimal). The main idea of the proof is 
the same as in Theorem 3 of [15], but we slightly adapt that argument to let t vary independently 
of d. Note that this is stronger than proving separate lower bounds of Q(t) and 11(4), since that 
would only show a lower bound of Q(d + t), which is weaker than our Tt(td) lower bound. 
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Lemma 12. For any positive integer d and any t > 0, there exists a 2d-sparse Hamiltonian H with 
11 H 11max = 0(1) such that simulating H with constant precision for time t requires Q(td) queries. 

Proof. Similarly to the f l(t) lower bound from [5], we construct a sparse Hamiltonian whose dy¬ 
namics compute the parity of a bit string, and we use the fact that at least N/2 quantum queries 
are needed to compute the parity of N bits [4, 18]. 

First consider a Hamiltonian H\ whose graph is a path with N + 1 vertices. (Here the graph 
of H has a vertex for each basis state and an edge between two vertices if the corresponding entry 
of H is nonzero.) The Hamiltonian acts on vectors |i) with i € {0,..., N} and has nonzero matrix 
elements 


(i - l\Hi\i) = (i\H\\i - 1) = y/i(N - i + 1) (56) 

for i € [ N ]. Simulating H\ for time 7 t /2 starting with the state |0) gives the state | N) (i.e., 

g-ii?! vr/2 1 0 ) = 

Next, consider a Hamiltonian H 2 generated from a string x € {0,1}^ as in [5]. H 2 acts on 
vertices | i,j) with i € {0,... , N} and j G {0,1} and has nonzero matrix elements 

(i - l,j\H 2 \i,j ffi Xi) = (i,j © Xi\H 2 \i - 1 ,j) = y/i{N -i + 1) (57) 

for all i G [ N ] and j € {0,1}. By construction, 10,0) is connected to \i,j) if and only if j = aqffi- • •© 

Xi. In particular, |0, 0) is connected to | N, x\ ffi • • • ffi xn), and determining whether it is connected 
to \N, 0) or \N,1) determines the parity of x. The graph of H 2 consists of two disjoint paths, 
one containing |0, 0) and | N, x\ ffi • • • ffi xn)- Thus we have e~ lH27T ^ 2 \Q, 0) = \N, x\ ffi • • • ffi xn), so 
evolution for time ir/2 computes the parity of x. 

Finally, we construct the Hamiltonian H claimed in the lemma. As before, H is generated from 
a string x € {0,1}^. H acts on vertices | i,j,£) with i G {0, ... ,N}, j G {0,1}, and t G [d]. The 
nonzero entries of H are 

(i - l,j,£\H\i,j ffi Xi,£') = ( i,j ffi Xi,£'\H\i - 1 ,j,£) = \/i(N - i + l)/N (58) 

for all i G [N], j G {0,1}, and £,£' € [d]. The graph of H is similar to that of H 2 , except that for 
each vertex in H 2 , there are now d copies of it in H. Each vertex is connected to all d copies of 
its neighboring vertices, so the graph has maximum degree 2d. Observe that, having divided the 
matrix elements by N, we have ||id|| m ax = 0(1). 

Now we simulate the Hamiltonian starting from the state |0, 0, *), where | i, j, *} := |b 3i -0 

denotes a uniform superposition over the third register. The subspace span{|i, j, *) : * 6 (0,..., N}, 
j € {0,1}} is an invariant subspace of H. Since the initial state lies in this subspace, the quantum 
walk remains in this subspace. The nonzero matrix elements of H in this invariant subspace are 

(i - l,j,*\H\i,j ffi Xi,*) = ( i,j ffi Xi,*\H\i - 1 ,j,*} = dyJi(N - i + 1)/IV, (59) 

so we have e~ lHt \0, 0, *) = | N, x\ ffi ■ ■ • ffi xn , *) for t = Nn/2d. Since this determines the parity of 
x, we find a lower bound of £l(N) = Q(td) as claimed. □ 

It is now straightforward to use this result to prove Theorem 2. 

Proof of Theorem 2. We choose one of two Hamiltonians depending on whether the first or second 
term in (5) is larger. If r is larger, then we use Lemma 12. The value of d used in Lemma 12 is 
denoted dl here, to distinguish it from the d given in Theorem 2. Taking d! = |_c7/2j, we ensure 
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that d' is a positive integer, because d > 2. Then Lemma 12 shows that there is a 2d / - S p arse 
Hamiltonian; given this value of d', this Hamiltonian is also d-sparse, as required for Theorem 2. 

For Theorem 2, we are also given a required value for ||iL|| max . The Hamiltonian used in 
Lemma 12 has ||iL|| max = 0(1)- By multiplying that Hamiltonian by a scaling factor, we obtain a 
Hamiltonian with the required value of ||iL|| max . Dividing the time used in Lemma 12 by the same 
factor, the simulation requires time H(r) for constant precision. In Theorem 2 we require precision 
e, which can only increase the complexity. 

In the case where the second term is larger, we use Theorem 6.1 of [7]. There it is shown that 
performing a simulation of a 2-sparse Hamiltonian with precision e and ||iL|| max t = 0(1) requires 


n 


( l°g(l/e) \ 
\ log log(l/e) / 


(60) 


queries. Because d > 2, this Hamiltonian is also d-sparse. As using larger values of ||iL|| max f can 
only increase the complexity, we also have this lower bound in the more general case. Therefore, 
regardless of whether the first or second term in (5) is larger, this expression provides a lower bound 
on the complexity. □ 


It is also possible to combine our lower bound with the lower bound of [7] to obtain a combined 
lower bound in terms of d, t, and e, that is stronger than Theorem 2. This yields a lower bound 
of fl(JV) for any N that satisfies e < ^|sin(td/lV)| Ar . Note that when e is a constant, we recover 
Lemma 12 and when td is constant, we recover (60). However, for intermediate values this lower 
bound can be strictly larger than the expression in Theorem 2. 


5 Conclusion 

Our technique for Hamiltonian simulation combines ideas from quantum walks and fractional- 
query simulation to provide improved performance over both previous techniques. As a result, it 
provides near-optimal scaling with respect to all parameters of interest. In particular, the scaling 
is only slightly superlinear in r = d||id|| max f, whereas we have proven that linear scaling is optimal. 
Furthermore, the method has query complexity sublogarithmic in the allowed error, which was 
proven to be optimal in [7]. 

Nevertheless, there is still a gap between the complexity of our algorithm and the lower bound 
in (5), as they involve different tradeoffs between the parameters r and e. It remains open whether 
the performance can be further improved, perhaps to give performance similar to (5), although as 
observed at the end of Section 4, we can rule out scaling strictly as in (5). 

Our technique can potentially be used for the more general task of operation conversion, in 
which we use one quantum operation to implement another. In our work, we convert a step of 
a quantum walk to Hamiltonian evolution, whereas in [21] the task is to convert Hamiltonian 
evolution to an inverse. One approach to operation conversion is to use phase estimation. Here we 
have shown that a superposition of operations can provide far better performance. 
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A Proofs of technical lemmas 


We now present proofs of some of the more technical results. 

Proof of Lemma 7. Consider two operators U and V acting on a pure state | if). For the following 
analysis we define 


4> := arg (tp\V"^U\if), 

N V := 1/ \J('ip\U^U\fj), 

N V := l/y]^\VW\^), 

N± := ^2±2N v N v \{f;\VW\if)\. 


Then we define a basis 


In terms of this basis, we have 


lx±> : = 


N V U\xl>) ± e^N v V\if) 


N+ 


1 


U W) =—( N +\X+) + N -\X-)), 

V W) = ^( N +\x+) - N-\x-)), 


so 


- V\il>){il>\V* = 
The eigenvalues of this matrix are 

A± = ^ 


1 

Nl 

N+NJ 

1 


-N+NJ 


N + N _ 

Nf 

41V2 

-N+N- 

Nf 


(if\tfu\ip) - {ip\vW\ip) ± + (^\VW\if)) 2 - 4|<-0|Wt[/|-0)| 2 


(61) 

(62) 

(63) 

(64) 

(65) 

( 66 ) 

(67) 

( 68 ) 


(69) 


The square root in this expression must be at least as large as \(iI’\U^U\iIj) — so A_ < 0 

and A+ > 0. Thus the trace norm (i.e., the Schatten 1-norm, denoted ||-||i) is 

mmrf -vwmv% = \\ + \ + \\-^ 

= y/((iJ,\WU\il>) + - 4|(^|Ft[/|^)|2 . (70) 

Next, from the definition of the spectral norm, 

wu-v\\ 2 >mu-v)\u-vm 

= {if\u^u\tf) + (v>|uV|v>) - {i)\v^u\if) - {if\uW\ii>) 

> (il>\rfu\il>) + (il>\V*V\il>) - 2 | (ip\V^U\i/j)\. ( 71 ) 


Using this inequality with the expression for the trace norm gives 

= {( 2 f\U^U\fj) + {if\vW\if) - 2|(^|Ut[/|^)|)((^|C/tt/|^) + {if\vW\if) + 2\{-if\V^U\if)\) 

< II U - W|| 2 ((-0|C/ + t/|^> + (^|uV|^) + 2| U\ip)\). (72) 
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Provided ||C/|| < 1 and ||H|| < 1, this yields 

mmrf - vw>)(^wi < 4|| u - yf, (73) 


so 


wuwmrf - Vtymv'lh < 2 ||U - V\\. (74) 

Given a mixed state p = strong convexity implies that 

\\UpU* - VpV% < '£p j \m j )('l’j\u' - |W||i 

3 

<2^p,\\U-V\\ 

3 

= 2||t7 — V||. (75) 

Similarly, tensoring with the identity gives 

||(l7®I) / !j(l7 t ®I) - (V ® I)p(F f <g> I) ||i < 2\\U ®I-V ®I\\ 

= 2\\U-V\\. (76) 

Hence, maximizing over p, the diamond norm satisfies 

\\U -V\\ <> <2\\U -V\\. (77) 

Now consider the case where U is some desired unitary, and V = Vo is an operation that happens 
if a measurement succeeds, with other operations Vj occurring for other measurement outcomes. 
That is, the overall channel is 

C(p) = VpVi + Y i V j pVj (78) 

3 

where V^V + V ^Vj = I. The trace is bounded by 

Tr(l VpV*) = Ti :(UpU*) - Tr(UpU f - V pV ] ) 

> 1- \\Uprf -VpV^lh 

>1-2\\U-V\\. (79) 

Hence the trace of the remaining part is bounded as 


Tr 



= Tr(C(p)) - TV(V>W) 

<1-(1-2||C/-H||) 

< 2\\U — V\\. 


For non-negative Hermitian operators, the trace norm is equal to the trace, so 


( 80 ) 


E v n> v i 


< 2\\U -V\\. 


( 81 ) 
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Since the trace is unchanged by tensoring with the identity, we have 


Hence 


® I) f 


< 2\\U -V\\. 


( 82 ) 


|| C — U || 0 = max 

p 


< max 
p 


(v ® i)p(y ® i)' 1 ' + Ew ® I )p(Vj ® I) f -(U® I )p(U ® I) f 


II (v ® i )p(v ® I) f - (u ® i)p(u ® n) t || i + 

< 4||?7 — V|| 


as claimed. 


(83) 

□ 


Proof of Lemma 8. For |m| < k, the values of a m differ from J m (z ) only due to the normalization 
factor in (42). The Bessel function J m (z) is bounded, for real z and integer m, by [1, 9.1.62] 


|«4m(^)| E 


1 


\m ! 


M 


(84) 


(here we use the fact that J- m (z ) = (—l) m J m ( 2 :) [1, 9.1.5]). Using this bound, together with the 
condition that \z\ < k, 


2 \Jm{z)\ E 2 


mi < v (i/2- i ^ k+1 

m i <A (k + i)\ f- [ / ) ~ (k + iy: 


(85) 


m=k +1 m=k -\-1 v ' m=k-\-l 

As a result, the normalization factor in (42) satisfies 

^ 1 


Y J m {z) >1 — 4 


\z/2f 


m=—k 


(& + !)!’ 


( 86 ) 


This means that a m closely approximates J m (z), in the sense that 

\z/2) k+l ^l 


— Jm(z') 


1 + 0 


(k + 1)! 


(87) 


Similarly, using \[J± — 1| < \vm\ and \z\ < k gives 


2 Y, 

m=k +1 


m- 


\z/2\ 


m 


S2M E 

m=k-\-l 

< 2\v\ Y m(l/2) m -( fc+1 ) 


(k + 1)! 
= 4|i/|(fc + 2) 


m=/c+l 

z/2| fc+1 


(fc + l)! 


( 88 ) 
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Using (41) gives 


1 = £ i). 


Therefore, with (88), we obtain 


/ 

£ 4(^M-l) = e--l + 0 , 

m=—k ' 

Using this expression together with (87) then gives 

E OminZ- 1 )- ^ 1 ' -^- (2/2)fc+1 


(z/2) 


fc+i 


e ,w -1 + 0 l v 


m=—k 

Using \e wz — 1| < \vz\ and \z\ < k gives 


k\ 


k\ 


1 + 0 


(z/2) 


fc+i \ 1 


(A: + 1)! 


E a rn {^-l) = e ivz -l + 0(u 


m=—k 

Because = 1, we obtain 


(z/2) 


fc+i 


k\ 


E CL m H± 


m=—k 


(z/2) 


fc+i 


k\ 


(89) 


(90) 


(91) 


(92) 


(93) 


Now Ylm=-k a ml jL ± are the eigenvalues of 14, and e wz are the eigenvalues of the desired unitary 
operation 14o- Using v = X/Xd, we have |z/| < ||i7||/Xd. Hence the norm of the difference of these 
operators is bounded as in (43). □ 

Proof of Lemma 11. First we bound the sum over small values of m. The Bessel functions satisfy 
[1, 9.1.76] 


£ = i. 


For any positive integer k we have 


E i^)i 2 <i. 

m=—k 

Using the Cauchy-Schwarz inequality, we find 


y'j \Jm( z )\ 4 


m=—k 


\ ^ \ 

1 m=—k \ 


y \Jm(z)\ 2 < V2k+1. 


m=—k 


ez 


Provided \z\ < k, using (85), together with l\ > (i/ef, we obtain 

fc+i 

2 E \ J m{z)\ <4 

m=k+1 

Combining (96) and (97), we hnd 

k +1 


2 (k + 1) 


S{z) < V2k + 1 + 4 


ez 


(94) 


(95) 


(96) 


(97) 


(98) 


2{k + 1) 

Finally, taking k = \ez/2~\, the second term is 0(1), which gives S(z) = 0(^/|z[) as claimed. □ 
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